A Bayesian Framework for Persistent Homology

An Introduction

Thomas Reinke

Baylor University

July 28, 2025

Contents

  1. Introduction
  2. Background
  3. Bayesian Framework
  4. Worked Example
  5. Classification
  6. References

Original Paper

A Bayesian Framework for Persistent Homology (Maroulas, Nasrin, and Oballe 2020)

Introduction

Real World Problem

  • Motivation
    • Detecting crystalline structures of materials: HEAs (high-entropy alloys)
  • Current method: APT (Atom Probe Tomography)
    • 3D reconstruction of atoms
    • Drawbacks: experimental noise and an abundance of missing data. In a typical experiment about 65% of the atoms in a sample are not registered, and the spatial coordinates of the registered atoms are corrupted by external noise.
    • Existing algorithms rely on symmetry arguments and cannot establish the crystal lattice

Real World Problem: Body Centered Cubic

Real World Problem: Face Centered Cubic

TDA Motivation

  • Explore topological structure in datasets
    • Persistent homology: associate shapes to data & summarize features with persistence diagrams
  • Applicable to many areas
    • handwriting analysis, brain arteries, image analysis, neuroscience, sensor networks, protein structures, biology, dynamical systems, action recognition, signal analysis, chemistry, genetics, object data, etc.

TDA Motivation

  • Persistence Diagrams (PDs) for inference & classification problems
  • Model PDs as random sets (not vectors): Poisson point processes governed by a single-parameter intensity
    • Derive a closed form of the posterior, relying on conjugate families of Gaussian mixtures

Background

Example Data

Persistence Diagrams

Fundamental geometric structures

  • Simplices: Basic shapes like points, line segments, triangles, tetrahedra.
  • Simplicial Complexes: Collections of simplices “glued” together in a structured way.

Simplices: Geometric Building Blocks

  • A \(k\)-simplex is the convex hull of \(k+1\) geometrically independent points (vertices) \(\{v_0, ..., v_k\}\).
    • Notation: \([v_0, ..., v_k] = \bigl\{ \sum_{i=0}^k \alpha_i v_i : \sum_{i=0}^k \alpha_i = 1, \alpha_i \ge 0 \bigr\}\)
  • Examples:
    • 0-simplex: \([v_0]\) (a point/vertex)
    • 1-simplex: \([v_0, v_1]\) (a line segment/edge)
    • 2-simplex: \([v_0, v_1, v_2]\) (a triangle, including interior)
    • 3-simplex: \([v_0, v_1, v_2, v_3]\) (a tetrahedron, including interior)
  • Faces: simplices spanned by subsets of the vertices (e.g., the edges and vertices of a triangle).
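The convex-hull definition above can be checked numerically by solving for the barycentric coordinates \(\alpha_i\). A minimal Python sketch (the helper `in_simplex` and its tolerance are illustrative choices, not from the paper):

```python
import numpy as np

def in_simplex(point, vertices, tol=1e-9):
    """Check whether `point` lies in the simplex spanned by `vertices`
    by solving for barycentric coordinates alpha_i >= 0 summing to 1."""
    V = np.asarray(vertices, dtype=float)   # (k+1, d) array of vertices
    p = np.asarray(point, dtype=float)
    # Solve [V^T; 1 ... 1] alpha = [p; 1] in the least-squares sense.
    A = np.vstack([V.T, np.ones(len(V))])
    b = np.append(p, 1.0)
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.all(alpha >= -tol) and np.allclose(A @ alpha, b, atol=tol)

# 2-simplex (triangle) with vertices v0, v1, v2:
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(in_simplex((0.25, 0.25), tri))  # interior point -> True
print(in_simplex((0.8, 0.8), tri))    # outside the triangle -> False
```

A point is in the simplex exactly when it is a convex combination of the vertices, which is what the two conditions on \(\alpha\) encode.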

Simplicial Complexes

A Simplicial Complex \(S\) is a collection of simplices satisfying:

  1. Face Inclusion: If a simplex \(\xi \in S\), then all faces of \(\xi\) are also in \(S\).
  2. Consistent Intersection: The intersection of any two simplices \(\xi_1, \xi_2 \in S\) is either:
    • Empty (\(\emptyset\)), or
    • A common face of both \(\xi_1\) and \(\xi_2\)
  • Intuition: A well-constructed mesh made of points, edges, triangles, etc.
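For an abstract complex stored as sets of vertex labels, the intersection condition holds automatically, so face inclusion is the only thing to check. A small illustrative Python check (the function name is a hypothetical choice):

```python
from itertools import combinations

def is_simplicial_complex(simplices):
    """Face-inclusion check for an abstract complex whose simplices are
    given as tuples of vertex labels: every nonempty proper face of each
    simplex must itself belong to the collection."""
    S = {frozenset(s) for s in simplices}
    for s in S:
        for k in range(1, len(s)):                  # faces of each lower dimension
            for face in combinations(s, k):
                if frozenset(face) not in S:
                    return False
    return True

# The triangle [0, 1, 2] together with all of its edges and vertices:
good = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
print(is_simplicial_complex(good))                  # True
print(is_simplicial_complex([(0, 1, 2), (0, 1)]))   # False: faces are missing
```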

Example

Vietoris-Rips

  • Goal: Build a simplicial complex from a point cloud \(X = \{x_i\}_{i=1}^L \subset \mathbb{R}^d\).
  • Vietoris-Rips Complex \(\mathcal{V}_r(X)\):
    • Parameter: Proximity radius \(r > 0\).
    • Rule: A simplex \([x_{i_1}, ..., x_{i_l}] \in \mathcal{V}_r(X)\) iff \(\text{diam}\{x_{i_1}, ..., x_{i_l}\} \leq r\)
  • Intuition: Connect groups of points that are close to each other.
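The rule above translates directly into code: include a simplex whenever its vertex set has diameter at most \(r\). A minimal Python sketch (the `max_dim` cutoff is a practical truncation of the full complex, not part of the definition):

```python
from itertools import combinations
import math

def vietoris_rips(points, r, max_dim=2):
    """Vietoris-Rips complex V_r(X): include a simplex on a vertex set
    iff the set's diameter (max pairwise distance) is at most r."""
    n = len(points)
    simplices = [(i,) for i in range(n)]            # all vertices (0-simplices)
    for k in range(1, max_dim + 1):                 # a k-simplex has k+1 vertices
        for idx in combinations(range(n), k + 1):
            diam = max(math.dist(points[a], points[b])
                       for a, b in combinations(idx, 2))
            if diam <= r:
                simplices.append(idx)
    return simplices

# Two nearby points and one distant point.
X = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(vietoris_rips(X, r=1.5))  # only the edge (0, 1) joins the vertices
```

Increasing \(r\) can only add simplices, which is exactly why the complexes nest into the filtration described next.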

VR Filtration

  • Given a nondecreasing sequence \(\{r_n\}_{n \in \mathbb{N}} \subset \mathbb{R}^+ \cup \{0\}\) with \(r_0 = 0\), we denote its Vietoris–Rips filtration by \(\{\mathcal{V}_{r_n}(X)\}_{n \in \mathbb{N}}\).
  • A Vietoris-Rips Filtration is a sequence of nested simplicial complexes: \[ \mathcal{V}_{r_0}(X) \subseteq \mathcal{V}_{r_1}(X) \subseteq \mathcal{V}_{r_2}(X) \subseteq \dots \]
  • As we increase the radius \(r\), more points get connected, and higher-dimensional simplices appear.
  • This filtration tracks how the topological structure (connectivity, holes, voids) evolves as the scale \(r\) changes.

VR Visualization

Persistence Diagrams (PDs)

  • A Persistence Diagram (PD) \(\mathcal{D}\) is a multiset of points in the space \(\mathcal{W} = \mathbb{W} \times \{0, 1, ..., D-1\}\).
    • The birth-death plane is \(\mathbb{W} = \{ (b, d) \in \mathbb{R}^2 \mid d \ge b \ge 0 \}\).
  • Each point \((b, d, k) \in \mathcal{D}\) represents:
    • A homological feature of dimension \(k\).
    • The feature appears (is born) at filtration scale \(b\).
    • The feature disappears (dies) at filtration scale \(d\).
  • Persistence: \(d-b\) (how long the feature “persists”). Longer persistence often indicates significant features.
  • Dimensions:
    • \(k=0\): Connected components.
    • \(k=1\): Loops or holes.
    • \(k=2\): Voids or cavities.
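As a data structure, a PD is just a multiset of triples. A tiny Python illustration with made-up feature values (the numbers are hypothetical), ranking features by persistence \(d-b\) and also showing the tilted map \((b, d, k) \mapsto (b, d-b, k)\) used later in the framework:

```python
# Hypothetical PD points (birth, death, dimension k), for illustration only.
pd_points = [(0.0, 0.3, 0), (0.0, 2.5, 0), (0.4, 1.9, 1)]

def persistence(pt):
    """Lifetime d - b of a feature (b, d, k)."""
    b, d, _ = pt
    return d - b

def tilt(pt):
    """Map (b, d, k) to the tilted representation (b, d - b, k)."""
    b, d, k = pt
    return (b, d - b, k)

# Long bars often indicate significant features; rank by persistence.
ranked = sorted(pd_points, key=persistence, reverse=True)
print(ranked[0])                      # (0.0, 2.5, 0): the long-lived component
print([tilt(p) for p in pd_points])
```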

Persistence Diagrams

The Poisson Point Process (PPP)

  • General Idea: Finite Point Processes \((\{p_n\}, \{\mathbb{P}_n\})\) model the number of points (\(N \sim \{p_n\}\)) and their spatial distribution (\((x_1, \dots, x_n) \sim \mathbb{P}_n\)).
  • Poisson PP \(\Pi\): A specific, widely used PP governed by an Intensity Measure \(\Lambda\) on \(\mathbb{X}\). Let \(\mu = \Lambda(\mathbb{X})\).
    • Cardinality (\(N\)): The number of points \(N \sim \text{Poisson}(\mu)\).
    • Spatial Distribution (\(\mathbb{P}_n\)): Given \(N=n\) points, their locations are drawn independently from the normalized intensity measure \(\frac{\Lambda(\cdot)}{\mu}\). \[ \mathbb{P}_n(A_1 \times \dots \times A_n) = \prod_{i=1}^n \left( \frac{\Lambda(A_i)}{\mu} \right) \]
  • Key Property: The intensity \(\Lambda(A)\) is the expected number of points falling within a region \(A \in \mathcal{X}\): \[ \mathbb{E}[|\Pi \cap A|] = \Lambda(A) \]
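The two-stage recipe (draw \(N\), then draw \(N\) i.i.d. locations) is easy to simulate. A minimal Python sketch (the uniform intensity on the unit square is an illustrative choice):

```python
import numpy as np

def sample_ppp(mu, sampler, rng):
    """Draw one realization of a Poisson point process:
    N ~ Poisson(mu), then N i.i.d. points from the normalized
    intensity Lambda(.)/mu (supplied here as `sampler`)."""
    n = rng.poisson(mu)
    return [sampler(rng) for _ in range(n)]

rng = np.random.default_rng(0)
# Intensity Lambda = mu * Uniform([0,1]^2), so E[|Pi ∩ A|] = mu * area(A).
points = sample_ppp(50.0, lambda r: r.uniform(0.0, 1.0, size=2), rng)
left_half = sum(1 for p in points if p[0] < 0.5)
print(len(points), left_half)  # counts near 50 and 25 on average
```

The key property is visible here: the expected count in any region \(A\) equals \(\Lambda(A)\), e.g. about 25 points in the left half of the square.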

Marked Poisson Point Processes

  • Motivation: Model points \(x \in \mathbb{X}\) that have associated attributes or ‘marks’ \(m\) from another space \(\mathbb{M}\). (e.g., PD points \(x\) with marks \(m\) describing feature properties).
  • Definition: A Marked PPP \(\Pi_M\) is a Point Process living on the product space \(\mathbb{X} \times \mathbb{M}\).
  • Key Components & Properties:
    • Relies on an underlying standard PPP on \(\mathbb{X}\) (intensity \(\Lambda\)) to place the points \(x\).
    • Uses a Stochastic Kernel \(\ell\): For any fixed location \(x \in \mathbb{X}\), \(\ell(x, \cdot)\) is a probability measure on the mark space \(\mathbb{M}\), defining the distribution \(P(\text{mark } | \text{ location } x)\).
    • Given points \(\{x_1, \dots, x_n\}\), their marks \(\{m_1, \dots, m_n\}\) are generated conditionally independently, with \(m_i\) drawn from the distribution \(\ell(x_i, \cdot)\).
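The conditional-independence structure of the marks can be sketched directly: draw locations from an underlying PPP, then draw each mark from the kernel at its location. A minimal Python illustration (the uniform locations and Gaussian-jitter kernel are hypothetical choices):

```python
import numpy as np

def sample_marked_ppp(mu, location_sampler, kernel_sampler, rng):
    """Marked PPP on X x M: locations x_i come from an underlying PPP on X;
    each mark m_i is then drawn conditionally from the kernel l(x_i, .)."""
    n = rng.poisson(mu)
    xs = [location_sampler(rng) for _ in range(n)]
    ms = [kernel_sampler(x, rng) for x in xs]   # conditionally independent marks
    return list(zip(xs, ms))

rng = np.random.default_rng(0)
# Locations uniform on [0, 1]; the kernel jitters each location by N(0, 0.05^2).
pairs = sample_marked_ppp(
    mu=10.0,
    location_sampler=lambda r: r.uniform(),
    kernel_sampler=lambda x, r: x + r.normal(0.0, 0.05),
    rng=rng,
)
print(pairs[:3])
```

This is exactly the role the stochastic kernel \(\ell(y|x)\) plays in the Bayesian framework below: each latent point generates its observation conditionally on its own location.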

Bayesian Framework

Setting the Stage: Representation & Goal

  • Tilted Representation: We work with the “tilted” PD representation \((b, d-b)\) instead of \((b, d)\). This space is denoted \(\mathbb{W}\). (Notation Abuse: \(\mathcal{D}\) is used for the tilted version \(T(\mathcal{D})\)).
  • Focus: We analyze one homological dimension \(k\) at a time, assuming independence between dimensions (Model M1). (Notation Abuse: We write \(\mathcal{D}_X\) for \(\mathcal{D}^k_X\))
  • Goal: Develop a Bayesian framework to update our belief about a “true” underlying (latent) PD, represented by a prior intensity \(\lambda\), using one or more observed PDs \(\mathcal{D}_Y\).

Bayesian Framework: Analogy

                Bayesian framework for RVs           Bayesian framework for random PDs
  Prior         Modeled by a prior density \(f\)     Modeled by a Poisson PP with prior intensity \(\lambda\)
  Likelihood    Depends on observed data             Stochastic kernel \(\color{purple}{\ell(y|x)}\) depends on observed PDs
  Posterior     Compute the posterior density        Defines a Poisson PP with posterior intensity
  • Core Idea: Treat the unknown, latent PD \(\mathcal{D}_X\) as a random draw from a PPP defined by the prior intensity \(\lambda_{\mathcal{D}_X}\). Use observed PDs \(D_{Y^1}, ..., D_{Y^m}\) (for fixed \(k\)) and the likelihood kernel \(\color{purple}{\ell(y|x)}\) to get the posterior intensity \(\lambda_{\mathcal{D}_X | D_{Y^{1:m}}}\).

The Model: Latent PD \(\mathcal{D}_X\)

  • The “true” underlying PD \(\mathcal{D}_X\) is modeled as a Poisson Point Process (PPP) with a prior intensity density \(\lambda_{\mathcal{D}_X}(x)\).
  • Any potential feature \(x\) from the prior might be observable or vanish (unobserved).
    • \(\alpha(x)\): Probability that feature \(x\) is potentially observable.
  • \(\mathcal{D}_X\) is decomposed into two independent PPPs:
    • \(\mathcal{D}_{XO}\) (Observed Track): Points that could generate an observation.
      • Intensity: \(\color{blue}{\alpha(x) \lambda_{\mathcal{D}_X}(x)}\)
    • \(\mathcal{D}_{XV}\) (Vanished Track): Points that are missed / not observed.
      • Intensity: \(\color{red}{(1-\alpha(x)) \lambda_{\mathcal{D}_X}(x)}\)
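This decomposition is an independent thinning of the prior process. A minimal Python sketch (the constant \(\alpha = 0.7\) and the sample points are hypothetical):

```python
import numpy as np

def split_by_alpha(points, alpha, rng):
    """Independently thin a draw of D_X: each point x goes to the observed
    track with probability alpha(x) and to the vanished track otherwise,
    giving PPPs with intensities alpha*lambda and (1 - alpha)*lambda."""
    observed, vanished = [], []
    for x in points:
        (observed if rng.uniform() < alpha(x) else vanished).append(x)
    return observed, vanished

rng = np.random.default_rng(0)
# Hypothetical tilted-PD points drawn from the prior, with constant alpha = 0.7.
prior_draw = [(0.1, 0.5), (0.2, 1.2), (0.3, 0.4), (0.0, 2.0)]
obs, van = split_by_alpha(prior_draw, alpha=lambda x: 0.7, rng=rng)
print(len(obs) + len(van))  # 4: every prior point lands in exactly one track
```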

The Model: Observed PD \(\mathcal{D}_Y\)

  • An observed PD \(\mathcal{D}_Y\) is also decomposed into two independent components:
  • \(\mathcal{D}_{YO}\) (Observed from Signal): Points generated from the latent observable points \(\mathcal{D}_{XO}\).
    • The pair \((\mathcal{D}_{XO}, \mathcal{D}_{YO})\) forms a Marked PPP.
    • Connection is via the stochastic kernel \(\color{purple}{\ell(y|x)}\) (the “likelihood”). It gives the probability density of observing \(y \in \mathcal{D}_{YO}\) given a latent point \(x \in \mathcal{D}_{XO}\).
  • \(\mathcal{D}_{YS}\) (Spurious / Noise): Points arising independently from noise, clutter, or unanticipated geometry.
    • Modeled as an independent PPP with intensity \(\color{darkgreen}{\lambda_{\mathcal{D}_{YS}}(y)}\).

Bayes’ Theorem for PDs

  • Goal: Find the posterior intensity \(\lambda_{\mathcal{D}_X | D_{Y^{1:m}}}(x)\) for the latent PD \(\mathcal{D}_X\), given \(m\) independent observed PDs \(D_{Y^1}, \dots, D_{Y^m}\).
  • Let \(D_{Y^{1:m}} = \cup_{i=1}^m D_{Y^i}\). The posterior intensity is:

\[ \lambda_{\mathcal{D}_X | D_{Y^{1:m}}}(x) = \underbrace{\color{red}{(1 - \alpha(x))\lambda_{\mathcal{D}_X}(x)}}_{\text{Prior Vanished Part}} + \underbrace{\frac{1}{m} \alpha(x) \sum_{i=1}^m \sum_{y \in D_{Y^i}} \frac{ \color{purple}{\ell(y|x)}\color{blue}{ \lambda_{\mathcal{D}_X}(x)}}{\color{darkgreen}{\lambda_{\mathcal{D}_{Y_S}}(y)} + \color{blue}{\int_{\mathbb{W}} \ell(y|u) \alpha(u) \lambda_{\mathcal{D}_X}(u) du}}}_{\text{Update from Observed Points } y} \quad \text{a.s.} \]
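For a flavor of how this is evaluated, here is a minimal numerical sketch for \(m = 1\), a single-Gaussian prior and likelihood, constant \(\alpha\), and constant noise intensity; the restriction to \(\mathbb{W}\) is ignored, so this is illustrative only, not the paper's implementation:

```python
import numpy as np

def gauss2(x, mean, sigma):
    """Isotropic 2D Gaussian density (restriction to W ignored in this sketch)."""
    x, mean = np.asarray(x, float), np.asarray(mean, float)
    return float(np.exp(-np.sum((x - mean) ** 2) / (2 * sigma**2))
                 / (2 * np.pi * sigma**2))

def posterior_intensity(x, obs, prior_mean, prior_sigma, prior_weight,
                        alpha, lik_sigma, noise):
    """Evaluate the posterior intensity at x for m = 1 observed PD:
    prior vanished part + one update term per observed point y."""
    prior = prior_weight * gauss2(x, prior_mean, prior_sigma)
    total = (1.0 - alpha) * prior                      # prior vanished part
    for y in obs:
        # Closed form of the denominator integral for a Gaussian prior/kernel:
        # int l(y|u) alpha lambda(u) du = alpha * w * N(y; mu, (s_l^2 + s_p^2) I)
        evidence = alpha * prior_weight * gauss2(
            y, prior_mean, np.sqrt(lik_sigma**2 + prior_sigma**2))
        total += alpha * gauss2(y, x, lik_sigma) * prior / (noise(y) + evidence)
    return total

# One observed point near the prior mean pulls the intensity up around it.
lam = posterior_intensity(
    x=(0.5, 0.55), obs=[(0.5, 0.6)], prior_mean=(0.4, 0.5), prior_sigma=0.3,
    prior_weight=1.0, alpha=0.9, lik_sigma=0.1, noise=lambda y: 0.1)
print(lam)  # exceeds the prior intensity at the same x
```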

Interpretation & Computation: Gaussian Mixtures

  • Posterior: \(\lambda_{\mathcal{D}_X | D_{Y^{1:m}}}(x)\) blends prior belief (observable & unobservable) with updates from observed \(y\) via likelihood \(\color{purple}{\ell(y|x)}\) relative to noise.
  • Key Idea: Likelihood \(\color{purple}{\ell(y|x)}\) acts on PD points (\(x, y \in \mathbb{W}\)), avoiding raw data issues. But posterior calculation (Thm 3.1) can be complex.
  • Solution: Conjugate Priors via Gaussian Mixtures (GMs)
    • Use Gaussian Mixtures (GMs) restricted to \(\mathbb{W}\) (\(\mathcal{N}^*\)) for conjugacy (posterior stays in the same family).
    • Assume (M2’, M3’): Prior \(\lambda_{\mathcal{D}_X}\), Likelihood \(\color{purple}{\ell(y|x)}\) (single \(\mathcal{N}^*\)), Noise \(\color{darkgreen}{\lambda_{\mathcal{D}_{YS}}}\) are Gaussian/GM based.
    • Result (Prop 3.2): Posterior \(\lambda_{\mathcal{D}_X | D_{Y^{1:m}}}(x)\) is also a GM.
    • Benefit: Parameters update via standard Gaussian rules \(\implies\) computationally tractable framework.
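The "standard Gaussian rules" amount to the usual product-of-Gaussians update. A sketch (dropping the \(\alpha\), noise, and restriction-to-\(\mathbb{W}\) bookkeeping of Prop 3.2) of how each prior component and observed point combine into one posterior component:

```python
import numpy as np

def gm_posterior_components(obs, means, sigmas, weights, lik_sigma):
    """Each (prior component j, observed point y) pair yields one posterior
    component via the product-of-Gaussians rule; isotropic covariances."""
    comps = []
    for y in np.asarray(obs, float):
        for mu, s, w in zip(np.asarray(means, float), sigmas, weights):
            var = (s**2 * lik_sigma**2) / (s**2 + lik_sigma**2)
            mean = (lik_sigma**2 * mu + s**2 * y) / (s**2 + lik_sigma**2)
            # Unnormalized weight: prior weight times the marginal density of y.
            tot = s**2 + lik_sigma**2
            marg = np.exp(-np.sum((y - mu) ** 2) / (2 * tot)) / (2 * np.pi * tot)
            comps.append((w * marg, mean, np.sqrt(var)))
    return comps

# One prior component at (0.5, 0.5); one observed point at (0.6, 0.7).
comps = gm_posterior_components(
    obs=[(0.6, 0.7)], means=[(0.5, 0.5)], sigmas=[0.3], weights=[1.0], lik_sigma=0.1)
w, mean, sd = comps[0]
print(mean)  # pulled from the prior mean toward the observation
```

Because the likelihood is much narrower than the prior here, the posterior component mean lands close to the observed point; this per-pair update is what keeps the whole posterior inside the GM family.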

Worked Example

Parameters Summary

Data Generation Parameters

  Case       Data Noise Std Dev (σ_Data)
  Case I     0.032
  Case II    0.100
  Case III   0.316

Simulated Point Clouds

Prior Intensities

Case I Results

Case II Results

Case III Results

Case IV Results

Classification

Real-World Example: Classifying Crystal Lattices

Problem:

  • Classify Body-Centered Cubic (BCC) vs. Face-Centered Cubic (FCC) crystal structures in high-entropy alloys.
  • Data is from Atom Probe Tomography (APT): inherently noisy and sparse (~65% missing data).
  • Existing methods struggle with combined noise/sparsity or require prior knowledge of global structure.

Approach using Persistent Homology (PH):

  • Represent spatial atom configurations as point clouds.
  • Compute Rips filtrations and extract 1-dimensional Persistence Diagrams (PDs).
  • Idea: Topological features (connectedness, holes/voids) captured by PDs differ between BCC (center atom) and FCC (center void, face atoms).
    • (See Figure 10 in paper for example PDs from each class).

Bayesian Classification: Method & Results

Experiment Setup:

  • Data: 200 BCC PDs, 200 FCC PDs.
  • Priors Tested :
    • Prior-1: Class-specific (means from K-means on subset of data).
    • Prior-2: Common “flat” prior (mean=(1,1), \(\sigma= \sqrt{20}\), high variance).
  • Likelihood/Noise Params:
    • Likelihood kernel std dev: \(\sigma_{K} = 0.1\) (referred to as \(\sigma_{D_{YO}}\) in the text).
    • Unexpected-noise intensity: \(\lambda_{D_{YS}}(x) = 5 \mathcal{N}^*(x; (0, 0), \sigma=\sqrt{0.2})\). (The weight 5 sets the expected number of spurious points.)
  • Evaluation: 10-fold Cross-Validation, Area Under ROC Curve (AUC), Bootstrapped 95% CIs.
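As a stand-in for the classification step (not the paper's exact classifier), one can score a test PD by its log-intensity under each class's posterior mixture and pick the higher score. All intensities and points below are hypothetical:

```python
import numpy as np

def gm_intensity(x, comps):
    """Evaluate a Gaussian-mixture intensity, components (weight, mean, sigma)."""
    x = np.asarray(x, float)
    return sum(w * np.exp(-np.sum((x - np.asarray(m)) ** 2) / (2 * s**2))
               / (2 * np.pi * s**2) for w, m, s in comps)

def score(pd_points, comps):
    """Log-score of an observed PD under a class's posterior intensity;
    classify by whichever class scores higher."""
    return sum(np.log(gm_intensity(p, comps) + 1e-12) for p in pd_points)

# Hypothetical class intensities: BCC-like features near (0.2, 0.4),
# FCC-like features near (0.6, 0.8) in the tilted plane.
bcc = [(3.0, (0.2, 0.4), 0.1)]
fcc = [(3.0, (0.6, 0.8), 0.1)]
test_pd = [(0.21, 0.38), (0.19, 0.42)]   # looks BCC-like
label = "BCC" if score(test_pd, bcc) > score(test_pd, fcc) else "FCC"
print(label)  # BCC
```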

Bayesian Classification: Method & Results

Results (AUC):

10-Fold CV AUC Summary (Bootstrapped)

  Prior                      Mean AUC   5th Percentile   95th Percentile
  Prior-1 (Class-Specific)   0.941      0.931            0.958
  Prior-2 (Common Flat)      0.940      0.928            0.951

Conclusion:

The Bayesian framework achieves strong classification performance (mean AUC ≈ 0.94), and the results are robust to the choice of prior (informative vs. flat). This demonstrates the framework's capability for machine-learning tasks on PDs drawn from challenging real-world data.

References

References

Maroulas, Vasileios, Farzana Nasrin, and Christopher Oballe. 2020. “A Bayesian Framework for Persistent Homology.” SIAM J. Math. Data Sci. 2 (1): 48–74.